Split Selection Methods for Classification

Author

  • Yu-Shan Shih
Abstract

Classification trees based on exhaustive search algorithms tend to be biased towards selecting variables that afford more splits. As a result, such trees should be interpreted with caution. This article presents an algorithm called QUEST that has negligible bias. Its split selection strategy shares similarities with the FACT method, but it yields binary splits and the final tree can be selected by a direct stopping rule or by pruning. Real and simulated data are used to compare QUEST with the exhaustive search approach. QUEST is shown to be substantially faster and the size and classification accuracy of its trees are typically comparable to those of exhaustive search.
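The selection bias mentioned in the abstract can be illustrated with a small simulation. The sketch below is not QUEST and is not taken from the article; it assumes a plain exhaustive search that maximizes the decrease in Gini impurity, and all function names and parameter values (`best_gain`, `fraction_many_split_variable_wins`, 200 trials, 50 cases) are hypothetical. Both predictors are generated independently of the class, so an unbiased selector would be expected to pick each about equally often.

```python
import random


def gini(labels):
    """Gini impurity of a list of 0/1 class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)


def best_gain(x, y):
    """Largest Gini impurity decrease over all splits of the form x <= c."""
    n = len(y)
    parent = gini(y)
    pairs = sorted(zip(x, y))
    best = 0.0
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no cut point exists between equal values
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        child = (len(left) * gini(left) + len(right) * gini(right)) / n
        best = max(best, parent - child)
    return best


def fraction_many_split_variable_wins(trials=200, n=50, seed=0):
    """Share of trials in which the continuous noise variable is selected."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        y = [rng.randint(0, 1) for _ in range(n)]         # class labels
        x_binary = [rng.randint(0, 1) for _ in range(n)]  # at most 1 split
        x_cont = [rng.random() for _ in range(n)]         # up to n - 1 splits
        if best_gain(x_cont, y) > best_gain(x_binary, y):
            wins += 1
    return wins / trials


if __name__ == "__main__":
    print(fraction_many_split_variable_wins())
```

Under these assumptions the continuous variable, which offers many more candidate cut points, tends to be chosen far more often than half the time even though neither variable carries any information about the class.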


Similar resources

An extensive comparison of recent classification tools applied to microarray data

Since most classification articles have applied a single technique to a single gene expression dataset, it is crucial to assess the performance of each method through a comprehensive comparative study. We evaluate by extensive comparison study extending Dudoit et al. (J. Amer. Statist. Assoc. 97 (2002) 77) the performance of recently developed classification methods in microarray experiment, and ...


Feature Selection and Dualities in Maximum Entropy Discrimination

Incorporating feature selection into a classification or regression method often carries a number of advantages. In this paper we formalize feature selection specifically from a discriminative perspective of improving classification/regression accuracy. The feature selection method is developed as an extension to the recently proposed maximum entropy discrimination (MED) framework. We describe MED...


Navigala: an Original Symbol Classifier Based on Navigation through a Galois Lattice

This paper deals with a supervised classification method, using Galois Lattices based on a navigation-based strategy. Coming from the field of data mining techniques, most literature on the subject using Galois lattices relies on selection-based strategies, which consist of selecting/choosing the concepts which encode the most relevant information from the huge amount of available data. Generall...


Extracting fuzzy classification rules with gene expression programming

In essence, data mining consists of extracting knowledge from data. This paper proposes an evolutionary system for discovering fuzzy classification rules. Fuzzy logic is useful for data mining especially in the case for performing classification task. Three methods were used to extract fuzzy classification rules using Evolutionary Algorithms: (1) genetic selection small number of large number of f...


Cloud Classification Using Error-Correcting Output Codes

Novel artificial intelligence methods are used to classify 16x16 pixel regions (obtained from Advanced Very High Resolution Radiometer (AVHRR) images) in terms of cloud type (e.g., stratus, cumulus, etc.). We previously reported that intelligent feature selection methods, combined with nearest neighbor classifiers, can dramatically improve classification accuracy on this task. Our subsequent analy...



Publication date: 1997